semantic network
OpenGloss: A Synthetic Encyclopedic Dictionary and Semantic Knowledge Graph
We present OpenGloss, a synthetic encyclopedic dictionary and semantic knowledge graph for English that integrates lexicographic definitions, encyclopedic context, etymological histories, and semantic relationships in a unified resource. OpenGloss contains 537K senses across 150K lexemes, on par with WordNet 3.1 and Open English WordNet, while providing more than four times as many sense definitions. These lexemes include 9.1M semantic edges, 1M usage examples, 3M collocations, and 60M words of encyclopedic content. Generated through a multi-agent procedural generation pipeline with schema-validated LLM outputs and automated quality assurance, the entire resource was produced in under one week for under $1,000. This demonstrates that structured generation can create comprehensive lexical resources at cost and time scales impractical for manual curation, enabling rapid iteration as foundation models improve. The resource addresses gaps in pedagogical applications by providing integrated content -- definitions, examples, collocations, encyclopedias, etymology -- that supports both vocabulary learning and natural language processing tasks. As a synthetically generated resource, OpenGloss reflects both the capabilities and limitations of current foundation models. The dataset is publicly available on Hugging Face under CC-BY 4.0, enabling researchers and educators to build upon and adapt this resource.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (15 more...)
- Law > Intellectual Property & Technology Law (0.68)
- Education > Educational Technology (0.46)
Optimal Foraging in Memory Retrieval: Evaluating Random Walks and Metropolis-Hastings Sampling in Modern Semantic Spaces
Human memory retrieval often resembles ecological foraging where animals search for food in a "patchy" environment. Optimal foraging means strict adherences to the Marginal V alue Thereom (MVT) in which individuals exploit a "patch" of semantically related concepts until it becomes less rewarding, then switch to a new cluster. While human behavioral data suggests foraging-like patterns in semantic fluency tasks, it is still unknown whether modern high-dimensional embedding spaces provide a sufficient representation for algorithms to closely match observed human behavior. By leveraging state-of-the-art embeddings and prior clustering and human semantic fluency data I find that random walks on these semantic embedding spaces produces results consistent with optimal foraging and the MVT. Surprisingly, introducing Metropolis-Hastings, an adaptive algorithm expected to model strategic acceptance and rejection of new clusters, does not produce results consistent with observed human behavior. These findings challenge the assumption that sophisticated sampling mechanisms inherently provide better cognitive models of memory retrieval. Instead, they highlight that appropriately structured semantic embeddings, even with minimalist sampling approaches, can produce near-optimal foraging dynamics. In doing so, my results support the perspective of Hills (2012) rather than Abbott (2015), demonstrating that modern embed-dings can approximate human memory foraging without relying on complex acceptance criteria.
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.95)
HarmNet: A Framework for Adaptive Multi-Turn Jailbreak Attacks on Large Language Models
Narula, Sidhant, Asl, Javad Rafiei, Ghasemigol, Mohammad, Blanco, Eduardo, Takabi, Daniel
Abstract--Large Language Models (LLMs) remain vulnerable to multi-turn jailbreak attacks. We introduce HarmNet, a modular framework comprising ThoughtNet, a hierarchical semantic network; a feedback-driven Simulator for iterative query refinement; and a Network Traverser for real-time adaptive attack execution. HarmNet systematically explores and refines the adversarial space to uncover stealthy, high-success attack paths. Experiments across closed-source and open-source LLMs demonstrate that HarmNet outperforms state-of-the-art methods, achieving significantly higher attack success rates. For example, on Mistral-7B, HarmNet achieves a 99.4% attack success rate--13.9%
Distributional Semantics Tracing: A Framework for Explaining Hallucinations in Large Language Models
Bhatia, Gagan, Sripada, Somayajulu G, Allan, Kevin, Azcona, Jacobo
Large Language Models (LLMs) are prone to hallucination, the generation of plausible yet factually incorrect statements. This work investigates the intrinsic, architectural origins of this failure mode through three primary contributions. First, to enable the reliable tracing of internal semantic failures, we propose Distributional Semantics Tracing (DST), a unified framework that integrates established interpretability techniques to produce a causal map of a model's reasoning, treating meaning as a function of context (distributional semantics). Second, we pinpoint the model's layer at which a hallucination becomes inevitable, identifying a specific commitment layer where a model's internal representations irreversibly diverge from factuality. Third, we identify the underlying mechanism for these failures. We observe a conflict between distinct computational pathways, which we interpret using the lens of dual-process theory: a fast, heuristic associative pathway (akin to System 1) and a slow, deliberate, contextual pathway (akin to System 2), leading to predictable failure modes such as Reasoning Shortcut Hijacks. Our framework's ability to quantify the coherence of the contextual pathway reveals a strong negative correlation ($ρ= -0.863$) with hallucination rates, implying that these failures are predictable consequences of internal semantic weakness. The result is a mechanistic account of how, when, and why hallucinations occur within the Transformer architecture.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Oceania > Australia > Australian Capital Territory > Canberra (0.04)
- Asia > Middle East > Jordan (0.04)
The role of media memorability in facilitating startups' access to venture capital funding
Toschi, L., Torrisi, S., Colladon, A. Fronzetti
Media reputation plays an important role in attracting venture capital investment. However, prior research has focused too narrowly on general media exposure, limiting our understanding of how media truly influences funding decisions. As informed decision-makers, venture capitalists respond to more nuanced aspects of media content. We introduce the concept of media memorability - the media's ability to imprint a startup's name in the memory of relevant investors. Using data from 197 UK startups in the micro and nanotechnology sector (funded between 1995 and 2004), we show that media memorability significantly influences investment outcomes. Our findings suggest that venture capitalists rely on detailed cues such as a startup's distinctiveness and connectivity within news semantic networks. This contributes to research on entrepreneurial finance and media legitimation. In practice, startups should go beyond frequent media mentions to strengthen brand memorability through more targeted, meaningful coverage highlighting their uniqueness and relevance within the broader industry conversation.
- Europe > United Kingdom > England > Tyne and Wear > Newcastle (0.04)
- Asia > China (0.04)
- North America > United States > California > Santa Clara County > Mountain View (0.04)
- (12 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
SANDWiCH: Semantical Analysis of Neighbours for Disambiguating Words in Context ad Hoc
Guzman-Olivares, Daniel, Quijano-Sanchez, Lara, Liberatore, Federico
The rise of generative chat-based Large Language Models (LLMs) over the past two years has spurred a race to develop systems that promise near-human conversational and reasoning experiences. However, recent studies indicate that the language understanding offered by these models remains limited and far from human-like performance, particularly in grasping the contextual meanings of words, an essential aspect of reasoning. In this paper, we present a simple yet computationally efficient framework for multilingual Word Sense Disambiguation (WSD). Our approach reframes the WSD task as a cluster discrimination analysis over a semantic network refined from BabelNet using group algebra. We validate our methodology across multiple WSD benchmarks, achieving a new state of the art for all languages and tasks, as well as in individual assessments by part of speech. Notably, our model significantly surpasses the performance of current alternatives, even in low-resource languages, while reducing the parameter count by 72%.
- Europe > Spain (0.46)
- Asia > China (0.28)
- North America > United States > Colorado (0.14)
- (15 more...)
A New Approach for Knowledge Generation Using Active Inference
Ghasimi, Jamshid, Movarraei, Nazanin
There are various models proposed on how knowledge is generated in the human brain including the semantic networks model. Although this model has been widely studied and even computational models are presented, but, due to various limits and inefficiencies in the generation of different types of knowledge, its application is limited to semantic knowledge because of has been formed according to semantic memory and declarative knowledge and has many limits in explaining various procedural and conditional knowledge. Given the importance of providing an appropriate model for knowledge generation, especially in the areas of improving human cognitive functions or building intelligent machines, improving existing models in knowledge generation or providing more comprehensive models is of great importance. In the current study, based on the free energy principle of the brain, is the researchers proposed a model for generating three types of declarative, procedural, and conditional knowledge. While explaining different types of knowledge, this model is capable to compute and generate concepts from stimuli based on probabilistic mathematics and the action-perception process (active inference). The proposed model is unsupervised learning that can update itself using a combination of different stimuli as a generative model can generate new concepts of unsupervised received stimuli. In this model, the active inference process is used in the generation of procedural and conditional knowledge and the perception process is used to generate declarative knowledge.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- Oceania > Australia (0.04)
- (5 more...)
- Health & Medicine > Therapeutic Area > Neurology (1.00)
- Health & Medicine > Consumer Health (0.87)
Dynamic semantic networks for exploration of creative thinking
Georgiev, Danko D., Georgiev, Georgi V.
Human creativity originates from brain cortical networks that are specialized in idea generation, processing, and evaluation. The concurrent verbalization of our inner thoughts during the execution of a design task enables the use of dynamic semantic networks as a tool for investigating, evaluating, and monitoring creative thought. The primary advantage of using lexical databases such as WordNet for reproducible information-theoretic quantification of convergence or divergence of design ideas in creative problem solving is the simultaneous handling of both words and meanings, which enables interpretation of the constructed dynamic semantic networks in terms of underlying functionally active brain cortical regions involved in concept comprehension and production. In this study, the quantitative dynamics of semantic measures computed with a moving time window is investigated empirically in the DTRS10 dataset with design review conversations and detected divergent thinking is shown to predict success of design ideas. Thus, dynamic semantic networks present an opportunity for real-time computer-assisted detection of critical events during creative problem solving, with the goal of employing this knowledge to artificially augment human creativity.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > New York (0.04)
- Europe > Finland > Northern Ostrobothnia > Oulu (0.04)
- (19 more...)
- Research Report > New Finding (0.66)
- Research Report > Promising Solution (0.46)
Algorithm for Semantic Network Generation from Texts of Low Resource Languages Such as Kiswahili
Wanjawa, Barack Wamkaya, Muchemi, Lawrence, Miriti, Evans
Box 30197 Nairobi 00100, Kenya eamiriti@uonbi.ac.ke Abstract Processing low-resource languages, such as Kiswahili, using machine learning is difficult due to lack of adequate training data. However, such low-resource languages are still important for human communication and are already in daily use and users need practical machine processing tasks such as summarization, disambiguation and even question answering (QA). One method of processing such languages, while bypassing the need for training data, is the use semantic networks. Some low resource languages, such as Kiswahili, are of the subject-verb-object (SVO) structure, and similarly semantic networks are a triple of subject-predicate-object, hence SVO parts of speech tags can map into a semantic network triple. An algorithm to process raw natural language text and map it into a semantic network is therefore necessary and desirable in structuring low resource languages texts. This algorithm tested on the Kiswahili QA task with upto 78.6% exact match. Highlights Languages, both low and high-resource are important for communication. Low resource languages lack vast data repositories necessary for machine learning. Use of language part of speech tags can create meaning from the language. An algorithm can create semantic networks out of the language parts of speech. The semantic network of the language can do practical tasks such as QA.
- Africa > Kenya > Nairobi City County > Nairobi (0.25)
- North America > United States (0.14)
- Oceania > Australia (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Measuring individual semantic networks: A simulation study
Aeschbach, Samuel, Mata, Rui, Wulff, Dirk U.
Accurately capturing individual differences in semantic networks is fundamental to advancing our mechanistic understanding of semantic memory. Past empirical attempts to construct individual-level semantic networks from behavioral paradigms may be limited by data constraints. To assess these limitations and propose improved designs for the measurement of individual semantic networks, we conducted a recovery simulation investigating the psychometric properties underlying estimates of individual semantic networks obtained from two different behavioral paradigms: free associations and relatedness judgment tasks. Our results show that successful inference of semantic networks is achievable, but they also highlight critical challenges. Estimates of absolute network characteristics are severely biased, such that comparisons between behavioral paradigms and different design configurations are often not meaningful. However, comparisons within a given paradigm and design configuration can be accurate and generalizable when based on designs with moderate numbers of cues, moderate numbers of responses, and cue sets including diverse words. Ultimately, our results provide insights that help evaluate past findings on the structure of semantic networks and design new studies capable of more reliably revealing individual differences in semantic networks.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- (2 more...)